EuroWordNet: A Multilingual Database with Lexical Semantic Networks

Authors

  • Piek Vossen
  • Graeme Hirst
Abstract

WordNet, the on-line English thesaurus and lexical database developed at Princeton University by George Miller and his colleagues (Fellbaum 1998), has proved to be an extremely important resource used in much research in computational linguistics where lexical knowledge of English is required. The goal of the EuroWordNet project is to create similar wordnets for other languages of Europe. The initial four languages are Dutch (at the University of Amsterdam), Italian (CNR, Pisa), Spanish (Fundacion Universidad Empresa), and English (University of Sheffield, adapting the original WordNet); later Czech, Estonian, German, and French will be added. The results of the project will be publicly available. Like the original Princeton WordNet, the new wordnets (that's now a generic term) are hierarchies in which each node is a synset: a word sense, with which one or more synonymous words or phrases are associated. The synsets are connected by relations such as hyponymy, meronymy, and antonymy. However, some improvements have been made to the original design of WordNet. New relationships, including relationships across parts of speech, have been introduced. For example, the verb adorn and the noun adornment are related by XPOS_NEAR_SYNONYMY; hyponymy and hyperonymy across parts of speech are also permitted. Semantic roles of verbs are marked; for example, the noun student is related to the verb teach by ROLE_PATIENT; the inverse relationship is called INVOLVED_PATIENT. Another new relationship, both within and across parts of speech, is causality, which may further be marked as intentional or nonfactive; for example, to redden CAUSES red; to search CAUSES (nonfactive, intentional) to find. Meronymy is much more fine-grained than in Princeton WordNet, with a number of new kinds of part-whole relationships.
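The relations described above can be pictured with a minimal Python sketch. It is only an illustration and not the EuroWordNet database format: the `Synset` class, the `relate` and `targets` helpers, and the identifier scheme (`"v:teach"` and so on) are hypothetical names introduced here. Only the relation names, the label names, and the example word pairs come from the text.

```python
from dataclasses import dataclass, field

# The one inverse pair named in the text. Other relations are stored
# one-way in this sketch (an assumption made for brevity).
INVERSE = {
    "ROLE_PATIENT": "INVOLVED_PATIENT",
    "INVOLVED_PATIENT": "ROLE_PATIENT",
}


@dataclass
class Synset:
    """A word sense together with the words that express it (hypothetical layout)."""
    synset_id: str   # hypothetical identifier, e.g. "v:teach"
    pos: str         # part of speech
    words: list
    relations: list = field(default_factory=list)  # (relation, target_id, labels)


def relate(source, relation, target, labels=()):
    """Record a labeled relation and, where one is known, its inverse."""
    source.relations.append((relation, target.synset_id, frozenset(labels)))
    inverse = INVERSE.get(relation)
    if inverse is not None:
        target.relations.append((inverse, source.synset_id, frozenset(labels)))


def targets(synset, relation):
    """Return the ids of all synsets reachable from `synset` by `relation`."""
    return [tid for rel, tid, _ in synset.relations if rel == relation]


# The examples from the text.
adorn = Synset("v:adorn", "verb", ["adorn"])
adornment = Synset("n:adornment", "noun", ["adornment"])
student = Synset("n:student", "noun", ["student"])
teach = Synset("v:teach", "verb", ["teach"])
redden = Synset("v:redden", "verb", ["redden"])
red = Synset("a:red", "adjective", ["red"])
search = Synset("v:search", "verb", ["search"])
find = Synset("v:find", "verb", ["find"])

relate(adorn, "XPOS_NEAR_SYNONYMY", adornment)   # across parts of speech
relate(student, "ROLE_PATIENT", teach)           # also adds INVOLVED_PATIENT
relate(redden, "CAUSES", red)
relate(search, "CAUSES", find, labels=("nonfactive", "intentional"))
```

With these objects, `targets(teach, "INVOLVED_PATIENT")` follows the inverse link from the verb back to the noun student, and the labels on the search/find link record that the causation is intentional but not guaranteed to succeed.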
The most important new development, however, is multilinguality: the use of a common framework to build the individual wordnets and integrate them in a single database in which an inter-lingual-index (ILI) connects the synsets that are "equivalent" in the different languages. EuroWordNet thus becomes a multilingual lexicon and thesaurus that could be used in applications such as multilingual text retrieval and (rather basic) lexical transfer in machine translation. The project has sought to

Similar resources

Lexicography in an Interlingual Ontology: An Introduction to EuroWordNet

EuroWordNet is a multilingual lexical database constructed in the wake of WordNet. The ontological structure of the language-dependent layers, analogous to individual WordNets, through the semantic space of the interlingual index and abstract framework of the top level ontologies are examined. The semantic nature of the interlingual lexicon is examined as it applies to Gruber’s principles for t...

The Automatic Mapping of Princeton WordNet Lexical-Conceptual Relations onto the Brazilian Portuguese WordNet Database

The Princeton WordNet (WN.Pr) lexical database has motivated efficient compilations of bulky relational lexicons since its inception in the 1980s. The EuroWordNet project, the first multilingual initiative built upon WN.Pr, opened up ways of building individual wordnets, and interrelating them by means of the so-called Inter-Lingual-Index, an unstructured list of the WN.Pr synsets. Other important...

Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages

The last two decades have seen the development of various semantic lexical resources such as WordNet (Miller, 1995) and the USAS semantic lexicon (Rayson et al., 2004), which have played an important role in the areas of natural language processing and corpus-based studies. Recently, increasing efforts have been devoted to extending the semantic frameworks of existing lexical knowledge resource...

A Bottom-up Comparative Study of EuroWordNet and WordNet 3.0 Lexical and Semantic Relations

The paper presents a comparative study of semantic and lexical relations defined and adopted in WordNet and EuroWordNet. This document describes the experimental observations achieved through the analysis of data from different WordNet versions and EuroWordNet distributions for different languages, during the development of JMWNL (Java Multilingual WordNet Library), an extensible multilingual l...

JMWNL: an Extensible Multilingual Library for Accessing Wordnets in Different Languages

In this paper we present JMWNL, a multilingual extension of the JWNL Java library, which was originally developed for accessing Princeton WordNet dictionaries. JMWNL broadens the range of JWNL’s accessible resources by also covering dictionaries produced inside the EuroWordNet project. Specific resources, such as language-dependent algorithmic stemmers, have been adopted to cover the diversitie...

Publication date: 1998